Abstract 

It is well known that a stable matching in a many-to-one matching market with couples need 
not exist. We introduce a new matching algorithm for such markets and show that for a general 
class of large random markets the algorithm will find a stable matching with high probability. 
In particular we allow the number of couples to grow at a near-linear rate. Furthermore, truth- 
telling is an approximated equilibrium in the game induced by the new matching algorithm. 
Our results are tight: for markets in which the number of couples grows at a linear rate, we 
show that with constant probability no stable matching exists. 



1 Introduction 

We consider a many-to-one matching market, in which one side of the market consists of hospitals 
and the other consists of doctors. Stability is the most natural and desired property in such markets. 
Therefore understanding when a stable matching exists in a matching market with couples as well 
as providing an efficient procedure to find one (whenever it exists) are both important tasks, and 
are the main scope of this paper. 



Gale and Shapley (1962) introduced the well-known Deferred Acceptance algorithm and showed
that if doctors' preferences do not depend on other doctors' preferences, in other words all doctors
are "single", the algorithm always produces a stable matching. When couples are
present in the market, however, their preferences depend on each other and often introduce
complementarities, so a stable matching may not exist (Roth (1984)). In fact, for any market size one can construct
a preference profile for which a stable matching does not exist, and even if a stable matching does
exist, finding it can be computationally intractable (Ronn (1990)). 

Several clearinghouses exist today for two sided markets with couples. Two major examples are 
the National Resident Matching Program (NRMP) and the clearinghouse for psychology interns. 
Until not long ago couples had to participate as singles, since clearinghouses for these markets
used the Deferred Acceptance algorithm to find a matching. Only since 1999 have the NRMP and the
psychology market adopted the new algorithm designed by Roth and Peranson (1999), which allows
couples to express their joint preferences, henceforth called the Roth-Peranson (RP) algorithm.
This algorithm has had great success in practice: every year since it was adopted, the NRMP has found
a stable matching with respect to the reported preferences. For a comprehensive background and
history of these markets see Kojima et al. (2010); Roth (2009). 



Klaus and Klijn (2005)¹ initiated the study of characterizing markets with couples that have a
stable matching. They showed that the domain of responsive preferences is a maximal domain in
which a stable matching exists. However, Kojima et al. (2010) observe from real data that couples'
preferences often do not belong to this domain. Adopting a random preferences approach, they
used a much simplified version of RP (that attempts to find a stable matching) to show that if
there are $n$ single doctors and the number of couples is of order $\sqrt{n}$, a stable matching exists with
probability converging to one as $n$ approaches infinity. The approach of studying random growing
markets is well founded.² About 16,000 single doctors and 800 couples participated in the NRMP in
2010,³ and about 3,000 single doctors and 19 couples participated in the psychology clearinghouse
in the same year. Furthermore, these figures are increasing every year. While the size of the market
justifies the large-market assumption, the number of couples increases every year. Thus, although the
results by Kojima et al. (2010) can explain the success in the psychology market, the success of the
NRMP market remains a puzzle, as the number of couples is much larger than $\sqrt{n}$. 

We introduce a new matching algorithm, called Sorted Deferred Acceptance (SoDA), for many-to-one
matching markets with couples. Our approach is slightly different from RP, although the
algorithms share similar ideas. The SoDA algorithm is simple and consists of two main steps. First,
(i) it finds a stable matching in the sub-market without couples using Deferred Acceptance. Then,
(ii) in some given order, each couple $c$ applies according to its preference list; whenever a single is
rejected she keeps applying until she finds a position. If some other couple $c'$ has been rejected after being
assigned, the second step starts over, this time letting $c$ apply just ahead of $c'$. 

1 See also Klaus et al. (2009). 

2 Immorlica and Mahdian (2005) and Kojima and Pathak (2009) also used a similar large market approach to
study incentives and stability in one-to-one and many-to-one matching markets without couples. 

3 In fact there were about 40,000 doctors, but only 16,000 of them were from American institutions, and most couples were
from American institutions. 

As noted above we study large markets and analyze the performance of the SoDA algorithm
in these markets. In our model, hospitals have capacities, there is an excess number of available
positions,⁴ all doctors are acceptable to all hospitals and vice versa, doctors' preferences are random
and hospital preferences are arbitrary. As we will show, when doctors' preference lists are long, an
excess number of positions is necessary for the existence of stability even if there is only one couple
in the market.⁵ All our main results hold without restricting the length of doctors' preference lists. 

We first provide positive results for a near-linear rate. If the number of couples grows at a rate
of at most $n^{1-\epsilon(n)}$, where $\epsilon(n)$ is a 'slowly' decreasing function converging to zero,⁶ then: 

1. The probability that a stable matching exists and is found by the SoDA algorithm approaches
1 (as $n$ approaches infinity). 

2. The probability that any doctor or any couple can gain by misreporting her preferences
converges to 0, even ex post. A similar result can be shown for hospitals, implying that
truth-telling is an approximated Bayes Nash equilibrium in the game induced by SoDA for any
large enough $n$. 

Note that if $\epsilon(n)$ is approximately $1/\log n$ then the growth rate of couples is linear. Our result
holds for any $\epsilon(n) = \Omega(\log\log n/\sqrt{\log n})$ (see the last section for further discussion).⁷ 

Our first result is tight in the following sense. When the number of couples grows at a rate of
$\alpha n$ for some $\alpha > 0$ we show: 

3. For some $\lambda > 2\alpha + 1$, if the number of hospitals is $\lambda n$, with constant probability (not depending
on $n$) no stable matching exists (even if hospitals' preferences are random). 

4 There are $\lambda n$ positions for some $\lambda > 1$. 

5 Kojima et al. (2010) do not assume an excess number of positions. They assume, however, that doctors have
'short' preference lists, and show that this results in an excess number of positions. 

6 $\epsilon(n)$ can be replaced by any fixed $\epsilon > 0$. 

7 We write $f(n) = \Omega(g(n))$ if there exists $c > 0$ such that $f(n) > c\,g(n)$ for every large enough $n$. 

While the third result does not cover the case when the excess is small, we believe that the result
remains true in this case. We give evidence, based on simulation, that in the setting where the
number of couples is linear, the probability of failure decreases as $\alpha$ decreases but remains constant
as $n$ grows. One can thus view our results as a characterization of the existence of a stable matching
with high probability in a large random market with couples. 

We also show that the SoDA algorithm runs in polynomial time (in fact 'almost' linear), and
provide simulations that test SoDA in various large random markets. Finally, we believe our proof
technique is interesting for its own sake, and may serve as a tool in the search for positive results
in other settings with complementarities. Some open problems are discussed in the last section. 

SoDA is the first algorithm for matching markets with couples that is proven to find a stable
outcome under very general settings. The provable success of the SoDA algorithm helps explain
the fact that algorithms, RP in particular, have been successful in finding stable matchings in real
life.⁸ This adds to the short list of positive results in settings with complementarities (see e.g.
Milgrom (2004), Gul and Stacchetti (1999), Ning and Yang (2006) and Lahaie and Parkes (2009)
for auction settings, and Hatfield and Kominers (2009) and Pycia (2010) for matching settings). 



2 Matching Markets with Couples 
2.1 Model 

In a matching market there is a set of hospitals $H$, a set of single doctors $S$ and a set of couples of
doctors $C$. Each single doctor $s \in S$ has a strict preference relation $\succ_s$ over the set of hospitals.
Each couple $c \in C$, denoted by $c = (f, m)$, has a strict preference relation $\succ_c$ over pairs of hospitals.
For every couple $c$ we denote by $f_c$ and $m_c$ the first and second members of $c$. Denote by $D$ the
set of all doctors, that is, $D = S \cup \{m_c \mid c \in C\} \cup \{f_c \mid c \in C\}$. Each hospital $h \in H$ has a fixed
capacity $k_h > 0$ and a strict preference relation $\succ_h$ over the set $D$. For any set $D' \subseteq D$, hospital $h$'s
choice given $D'$, i.e. the most preferred doctors $h$ can employ, $Ch_h(D')$, is induced by $\succ_h$ and $k_h$
as follows: $d \in D' \cap Ch_h(D')$ if and only if there exists no set $D'' \subseteq D' \setminus \{d\}$ such that $|D''| = k_h$
and $d' \succ_h d$ for all $d' \in D''$. 

8 We believe that our techniques can be adapted to prove directly that the RP algorithm also succeeds with
high probability in large random markets. 

A matching $\mu$ is a function on $H \cup C \cup S$ such that $\mu(s) \in H \cup \{\phi\}$ for every $s \in S$,
$\mu(c) \in H \times H \cup \{(\phi, \phi)\}$ for every $c \in C$, $\mu(h) \in 2^D$ for every $h \in H$, and: 

(i) $s \in \mu(h)$ if and only if $\mu(s) = h$. 

(ii) $\mu(c) = (h, h')$ if and only if $f_c \in \mu(h)$ and $m_c \in \mu(h')$. 

$\mu(s) = \phi$ means that $s$ is unassigned under $\mu$, and similarly $\mu(c) = (\phi, \phi)$ means that the couple $c$
is unassigned under $\mu$. 

We proceed to define stability. Blocking coalitions for a given matching $\mu$ can be formed in several
ways: 

• $(s, h) \in S \times H$ is a block of $\mu$ if $h \succ_s \mu(s)$ and $s \in Ch_h(\mu(h) \cup s)$. 

• $(c, h, h') \in C \times H \times H$ (where $h \neq h'$) is a block of $\mu$ if $(h, h') \succ_c \mu(c)$, $f_c \in Ch_h(\mu(h) \cup f_c)$,
and $m_c \in Ch_{h'}(\mu(h') \cup m_c)$. 

• $(c, h) \in C \times H$ is a block of $\mu$ if $(h, h) \succ_c \mu(c)$ and $\{f_c, m_c\} \subseteq Ch_h(\mu(h) \cup c)$. 

Finally, a matching $\mu$ is stable if there is no block of $\mu$. 
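The three kinds of blocks can be checked mechanically. The following is a minimal sketch, not part of the formal development: the data layout (dictionaries keyed by hospital, single and couple names, `None` standing for the unassigned outcome) and the names `choice`, `rank` and `find_block` are illustrative assumptions. Hospital preference lists are assumed to rank every doctor, from most to least preferred.

```python
from typing import Dict, List, Optional, Set, Tuple

Pair = Tuple[str, str]

def choice(pref: List[str], capacity: int, candidates: Set[str]) -> Set[str]:
    """Ch_h(D'): the `capacity` most preferred doctors among `candidates`."""
    ranked = [d for d in pref if d in candidates]
    return set(ranked[:capacity])

def rank(pref: list, option) -> int:
    """Position of `option` in `pref`; None (unassigned) ranks below every listed option."""
    return pref.index(option) if option in pref else len(pref)

def find_block(hosp_pref: Dict[str, List[str]], cap: Dict[str, int],
               single_pref: Dict[str, List[str]], couple_pref: Dict[str, List[Pair]],
               couples: Dict[str, Pair], mu_h: Dict[str, Set[str]],
               mu_s: Dict[str, Optional[str]], mu_c: Dict[str, Optional[Pair]]):
    """Return one block of the matching, or None if the matching is stable."""
    def ch(h: str, extra: Set[str]) -> Set[str]:
        return choice(hosp_pref[h], cap[h], mu_h[h] | extra)

    # Blocks (s, h) formed by a single doctor and a hospital.
    for s, prefs in single_pref.items():
        for h in prefs:
            if rank(prefs, h) < rank(prefs, mu_s[s]) and s in ch(h, {s}):
                return ("single", s, h)
    # Blocks formed by a couple and one hospital or two distinct hospitals.
    for c, (f, m) in couples.items():
        prefs = couple_pref[c]
        for (h, h2) in prefs:
            if rank(prefs, (h, h2)) >= rank(prefs, mu_c[c]):
                continue  # the couple does not strictly prefer this pair
            if h == h2:
                if {f, m} <= ch(h, {f, m}):
                    return ("couple", c, (h, h2))
            elif f in ch(h, {f}) and m in ch(h2, {m}):
                return ("couple", c, (h, h2))
    return None
```

A matching is stable exactly when `find_block` returns `None`; the sketch scans all candidate blocks directly and makes no attempt at efficiency.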

Gale and Shapley (1962) showed that the (doctor proposing) Deferred Acceptance algorithm 
described below, always produces a stable matching in a matching market without couples. They 
further showed that the stable matching produced by this algorithm is the one that is weakly 
preferred by all single doctors. Roth (1982) showed that the mechanism induced by this algorithm 
makes it a dominant strategy for all single doctors to report their true preferences. 
Doctor-Proposing Deferred Acceptance Algorithm (DA): 

Input: a matching market $(H, S, \succ_H, \succ_S)$ without couples. 

Step 1: Each single doctor $s \in S$ applies to her most preferred hospital. Each hospital rejects its
least preferred doctors in excess of its capacity among those who applied to it, keeping the rest of the
doctors temporarily. 

Step $t$: Each doctor who was rejected in Step $t-1$ applies to her next highest choice if such exists.
Each hospital considers these doctors as well as the doctors who are temporarily held from the
previous step, and rejects the least preferred doctors in excess of its capacity, keeping the rest of the
doctors temporarily. 

The algorithm terminates at a step where no doctor is rejected. 
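The steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the paper's implementation: the function name `deferred_acceptance` and the dictionary-based inputs are assumptions, and proposals are processed one doctor at a time instead of in synchronized steps, which is known to yield the same matching.

```python
from typing import Dict, List, Set

def deferred_acceptance(doctor_pref: Dict[str, List[str]],
                        hosp_pref: Dict[str, List[str]],
                        cap: Dict[str, int]) -> Dict[str, Set[str]]:
    """Doctor-proposing DA without couples; returns the doctors held by each hospital."""
    # rank[h][d] is the position of doctor d in hospital h's list (0 = most preferred).
    rank = {h: {d: i for i, d in enumerate(p)} for h, p in hosp_pref.items()}
    next_choice = {d: 0 for d in doctor_pref}   # index of the next hospital to try
    held: Dict[str, List[str]] = {h: [] for h in hosp_pref}
    free = list(doctor_pref)                    # doctors who still need to apply
    while free:
        d = free.pop()
        if next_choice[d] >= len(doctor_pref[d]):
            continue                            # list exhausted: d stays unassigned
        h = doctor_pref[d][next_choice[d]]
        next_choice[d] += 1
        held[h].append(d)
        held[h].sort(key=rank[h].__getitem__)   # most preferred first
        if len(held[h]) > cap[h]:
            free.append(held[h].pop())          # reject the least preferred doctor
    return {h: set(ds) for h, ds in held.items()}
```

For example, with three doctors who all rank `h1` above `h2` and unit capacities, the hospital lists alone decide who is held where and who remains unassigned.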

In the next section we introduce a new algorithm for finding a matching in a market with couples.
Roth (1984) showed that when there are couples, a stable matching sometimes does not exist. In
Section 4 we show that this algorithm produces a stable matching with very high probability in
a large market where the number of couples grows (almost) linearly. 

2.2 A New Matching Algorithm 

The matching algorithm that we present here first finds the stable matching in the market with- 
out couples (using DA) and then attempts to insert the couples, while maintaining the deferred 
acceptance idea. 

Informally, the new algorithm receives as input a matching market with couples and does the 
following: 

(i) Find the stable matching in the sub-market without couples using the DA algorithm. 

(ii) Fix an order $\pi$ over the couples. Let each couple $c$, on its turn according to $\pi$, apply to pairs
of hospitals according to its preference list $\succ_c$ (beginning with the most preferred). Once
it finds a pair of hospitals that accepts it, we assign the couple to the pair of hospitals and
stabilize the current matching as follows: 

Stabilize: Continue the DA algorithm with the singles that were rejected from their positions
in the pair of hospitals that the last couple $c$ was assigned to (at most two singles).
If during stabilizing one of the members of the last couple $c$ is rejected, the algorithm
fails. Otherwise, if some other couple $c' \neq c$ is rejected during stabilizing, the order $\pi$
is changed so that $c$ is moved one place ahead of $c'$, and part (ii) begins again with the
altered permutation. If the new order $\pi'$ has been tried previously the algorithm fails. 

Note that if the algorithm terminates without failure it produces a stable matching. As
mentioned in the previous section, this algorithm will serve as a main tool in showing that there exists
a stable matching in a large random market. 

Kojima et al. (2010) used a similar algorithm that allows couples to apply in only one order, i.e. if
some couple is evicted their algorithm fails, even though there might be a different order of couples'
applications which would not lead to such a failure. In our algorithm, if some couple has been rejected,
the algorithm allows couples to apply again using a different ordering. 

We next describe our algorithm formally. 
Sorted Deferred Acceptance Algorithm (SoDA): 

Input: A matching market $(H, S, C, \succ_S, \succ_H, \succ_C)$ and a default permutation $\pi$ over the set $\{1, 2, \ldots, |C|\}$.
Let $\Pi = \phi$. 

Step 1: Find the stable matching $\mu$ produced by the DA algorithm in the matching market
$(H, S, \succ_S, \succ_H)$ without couples. 

Step 2 [Iterate through the couples]: Let $i = 1$ and let $B = \phi$. 

(a) Let $c = c_{\pi(i)}$ be the $\pi(i)$-th couple. 

Let $c$ apply to the most preferred pair of hospitals $(h, h') \in H \times H$ that has not rejected it yet.
If such a pair of hospitals does not exist, modify $\mu$ such that $c = (f, m)$ is unassigned and go
to Step 2(a) with $i + 1$. If such a pair $(h, h')$ exists then: 

(a1) If $h = h'$ and $\{f, m\} \subseteq Ch_h(\mu(h) \cup c)$ then: 

Let $R = \mu(h) \setminus Ch_h(\mu(h) \cup c)$ be the rejected doctors from $h$. 

(a11) If there exists a couple $c' \neq c$ for which $\{f_{c'}, m_{c'}\} \cap R \neq \phi$ then: Let $j < i$ be such that
$c_{\pi(j)} = c'$. Let $\pi'$ be the permutation obtained from $\pi$ as follows:
$\pi'(j) = \pi(i)$, $\pi'(l) = \pi(l)$ for all $l$ such that $l < j$ or $l > i$, and $\pi'(l) = \pi(l-1)$ for
other $j + 1 \leq l \leq i$.
If $\pi' \in \Pi$ terminate the algorithm. Otherwise add $\pi'$ to $\Pi$ and go to Step 1 setting
$\pi = \pi'$. 

(a12) Modify $\mu$ by assigning $c$ to $h$, remove $R$ from $\mu(h)$. Add $R$ to $B$ and do Step 3
(Stabilize) with the couple $c$. 

(a2) If $h \neq h'$, $f \in Ch_h(\mu(h) \cup f)$, and $m \in Ch_{h'}(\mu(h') \cup m)$ then: 

Let $R_h = \mu(h) \setminus Ch_h(\mu(h) \cup \{f\})$ and $R_{h'} = \mu(h') \setminus Ch_{h'}(\mu(h') \cup \{m\})$. 

(a21) If there exists a couple $c' \neq c$ for which $\{f_{c'}, m_{c'}\} \cap (R_h \cup R_{h'}) \neq \phi$ then: Let $j < i$ be
such that $c_{\pi(j)} = c'$, and change $\pi$ as in Step 2(a11). If $\pi \in \Pi$ terminate the algorithm.
Otherwise add $\pi$ to $\Pi$ and go to Step 1. 

(a22) Modify $\mu$ by assigning $f$ to $h$ and $m$ to $h'$, remove $R_h$ from $\mu(h)$ and remove $R_{h'}$
from $\mu(h')$. Add $R_h \cup R_{h'}$ to $B$ and go to Step 3 (Stabilize) with the couple $c$. 

(a3) Otherwise, let $h$ and $h'$ reject the couple $c$ and go to Step 2(a). 

Step 3 [Stabilize]: Let $j = |B|$. As long as $j \geq 0$: 

(a) If $j = 0$ increment $i$ by one and go to Step 2(a). 

(b) Otherwise pick some $s \in B$ and: 

(b1) Let $h$ be the most preferred hospital $s$ has yet to apply to. If such a hospital does not exist
then modify the matching $\mu$ such that $s$ is unassigned and go to Step 2(a). Otherwise:
Let $R = (\mu(h) \cup \{s\}) \setminus Ch_h(\mu(h) \cup \{s\})$. 

(b21) If $\{f_c, m_c\} \cap R \neq \phi$ then the algorithm fails. 

(b22) If there exists a couple $c' \neq c$ for which $\{f_{c'}, m_{c'}\} \cap R \neq \phi$ then let $i$ and $j$ be such that
$c_{\pi(i)} = c$ ($c$ is the last couple that applied) and $c_{\pi(j)} = c'$. Change $\pi$ as in Step
2(a11). If $\pi \in \Pi$ terminate the algorithm. Otherwise add $\pi$ to $\Pi$ and go to Step 1. 

(b23) If $s \in R$ then go to Step 3(b1). 

(b24) Modify $\mu$ by assigning $s$ to $h$, remove $R$ from $\mu(h)$. Add $R$ to $B$ and go to Step 3. 

Observe that the SoDA algorithm fails to produce a matching in two cases: (i) if a couple $c$
that finds a pair of positions causes a "chain reaction" leading to the same couple $c$ being rejected
(Step 3(b21)), or (ii) if it is about to let couples apply in an order that has already been tried before
(Steps 2(a11), 2(a21) and 3(b22)), that is, it changes the permutation $\pi$ to a permutation $\pi'$ that already
belongs to $\Pi$. As mentioned above, if the algorithm does not fail it produces a stable matching. 
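The reordering used when a couple evicts an earlier couple, together with the bookkeeping of orders already tried, can be sketched as follows. The names `move_ahead` and `next_order` are hypothetical helpers introduced only for illustration, and positions are 0-based here, whereas the algorithm's description is 1-based.

```python
from typing import List, Optional, Set, Tuple

def move_ahead(order: List[str], i: int, j: int) -> List[str]:
    """Move the couple at position i to position j < i.

    The couples previously at positions j, ..., i-1 each shift one place
    back; all other positions are unchanged.
    """
    if not 0 <= j < i < len(order):
        raise ValueError("need 0 <= j < i < len(order)")
    return order[:j] + [order[i]] + order[j:i] + order[i + 1:]

def next_order(order: List[str], i: int, j: int,
               tried: Set[Tuple[str, ...]]) -> Optional[List[str]]:
    """Return the altered order, or None if it was already tried (failure)."""
    new_order = move_ahead(order, i, j)
    key = tuple(new_order)
    if key in tried:
        return None
    tried.add(key)
    return new_order
```

Storing each tried order as a tuple in a set makes the "has this permutation been tried before" check a constant-time lookup on average.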

The following definition will be useful throughout the paper. 

Definition 1 (Evicting) Let $d \in D$ be a doctor and suppose that $d$ is (temporarily) assigned to
some hospital $h$. Let $c \in C$. If during the execution of the SoDA algorithm some member of the
couple $c$ who is not assigned to $h$ applies to $h$ and causes $d$ to be rejected by $h$, we say that $d$ was
evicted by $c$. Furthermore, if $d$ was evicted by $c$, applies to some hospital $h'$ and causes some
other doctor $d'$ who is assigned to $h'$ to be rejected, we also say that $d'$ is evicted by $c$, and so forth.
Finally, if $d$ was evicted by $c$ and $d$ belongs to a couple $c'$ we say that $c'$ was evicted by $c$. Formally,
all doctors in the set $R$ in Steps 2(a1), 2(a2) and 3(b) are evicted by the applying couple $c$. 

Remark: According to this definition $c$ can evict itself. Such a phenomenon may occur since
one member of a given couple can evict the other member of the couple (in the algorithm this
happens in Step 3(b21)). 

3 A Large Market Model 

A random market is a tuple $\Gamma = (H, S, C, \succ_H, Z, Q)$ where $Z = (z_h)_{h \in H}$ and $Q = (q_h)_{h \in H}$ are
probability distributions over $H$. 

The preference list of each single doctor $s \in S$ is independently drawn as follows: for each
$k = 0, \ldots, |H| - 1$, given $s$'s preference list up to her $k$-th most preferred hospital, draw independently
according to $Z$ a hospital $h$ until $h$ does not appear among $s$'s $k$ most preferred hospitals and let it be
$s$'s $(k+1)$-th most preferred hospital. The preference list for each couple $c = (f, m)$ is drawn from
the distribution $Q \times Q$. 
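The drawing procedure for a single doctor amounts to repeated sampling with rejection of hospitals already ranked. A minimal sketch follows; the function name `draw_single_preferences` and the arguments `hospitals`, `weights` and `rng` are illustrative assumptions, and every weight is assumed to be positive (as uniform boundedness implies) so that the loop terminates.

```python
import random
from typing import List, Sequence

def draw_single_preferences(hospitals: Sequence[str],
                            weights: Sequence[float],
                            rng: random.Random) -> List[str]:
    """Draw a full preference list: sample hospitals from the given
    distribution and skip any hospital that already appears in the list."""
    ranking: List[str] = []
    seen = set()
    while len(ranking) < len(hospitals):
        h = rng.choices(hospitals, weights=weights, k=1)[0]
        if h not in seen:          # redraw until a new hospital appears
            seen.add(h)
            ranking.append(h)
    return ranking

# Usage: a doctor's list over three hospitals with a non-uniform distribution.
prefs = draw_single_preferences(["h1", "h2", "h3"], [0.5, 0.3, 0.2], random.Random(0))
```

Passing an explicit `random.Random` instance keeps the draws reproducible across runs.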

We will assume that the distributions $Z$ and $Q$ are uniformly bounded, that is, there exists $r > 1$
such that $\frac{z_h}{z_{h'}} \in [\frac{1}{r}, r]$ and $\frac{q_h}{q_{h'}} \in [\frac{1}{r}, r]$ for every $h, h' \in H$. Define $\gamma_{\max}$ to be the maximum
probability that a hospital is drawn either from $Z$ or from $Q$, that is, $\gamma_{\max} = \max_{h \in H} \max(q_h, z_h)$. 

We will consider a sequence of random markets $\Gamma^1, \Gamma^2, \ldots$ where
$\Gamma^n = (H^n, S^n, C^n, \succ_{H^n}, Z^n, Q^n)$, i.e. markets of growing size. 

Definition 2 A sequence of random markets $\Gamma^1, \Gamma^2, \ldots$ is called regular if there exist $0 < \epsilon < 1$,
$\lambda > 1$, $c > 0$ and $r > 1$ such that for all $n$: 

1. $|S^n| = n$ and $|C^n| = O(n^{1-\epsilon})$ (the number of couples grows almost linearly). 

2. for each hospital $h \in H^n$, $k_h < c$ (bounded capacity). 

3. $\sum_{h \in H^n} k_h > \lambda n$ (excess number of positions). 

Importantly, our results remain true even if $\epsilon$ is a 'slowly' decreasing function of $n$ converging to zero.
The exact rate is discussed in the last section. 

In their model, Kojima et al. (2010) assumed that each doctor's preference list is bounded by
a constant, whereas in our setting all hospitals are acceptable to each doctor. A key step in their
proof is to show that the number of unfilled positions grows linearly in $n$ with high probability.
Instead, we assume an excess number of positions. In fact one can show that under long preference
lists, an excess number of positions is necessary for the existence of a stable matching even with
one couple.⁹ 

Proposition 1 Consider a matching market with $n - 1$ singles, one couple $c$ and $n$ hospitals each
of capacity 1. Then there exist preferences for hospitals such that no stable matching exists. 

The proof follows by a simple embedding of the (elegant) counterexample by Klaus and Klijn
(2005), in particular by letting the preference of each hospital $h$ be $m_c \succ_h s \succ_h f_c$ for every single $s$. 

4 Stability 

In this section we show: 

Theorem 2 Let $\Gamma^1, \Gamma^2, \ldots$ be a regular sequence of random markets. Then the probability that
there exists a stable matching tends to 1 as $n$ goes to infinity. 

To prove Theorem 2 we will show that for random preferences the probability that the SoDA
algorithm ends without failure converges to 1 as $n$ goes to infinity. Before we prove the theorem
we provide some intuition and a brief outline of the proof. 

4.1 Intuition and Proof Sketch 

The goal is to show that if the number of couples is $m = n^{1-\epsilon}$ (for any $0 < \epsilon < 1$) then as
$n$ approaches infinity the probability of a stable match approaches 1. To better understand our
approach we begin with the intuition for why the result holds when the number of couples is $n^{1/2-\epsilon}$
(essentially we provide the intuition for the result by Kojima et al. (2010)). Then we provide intuition
for how to obtain the result when the number of couples is $n^{2/3-\delta}$, and finally when it is $n^{1-\epsilon}$. 

9 This is the only result for which we require that doctors' preference lists are long, that is, every hospital is acceptable to
every doctor. 

1. Number of couples is $n^{1/2-\epsilon}$: Consider the following simplified version of the SoDA algorithm,
which we call the direct algorithm: after finding the stable matching in the market without couples,
the couples apply one by one and if some couple evicts another couple the algorithm fails (i.e. it
does not attempt to change the permutation over the couples). Observe that if the algorithm does
not fail, it outputs a stable matching. 

We will therefore bound the probability that a member of a couple will be evicted from a
hospital. We do this iteratively. When the first couple applies, no other couple will be evicted (since
there are no couples in the system). When the second couple $c$ applies, what is the probability that
it will evict the first couple? 

The second couple $c$ creates a "chain reaction", which can cause several doctors who were
temporarily assigned to continue applying. To bound the length of this chain consider $f_c$. At some
point she is temporarily assigned to a hospital $h$. If this hospital's capacity was not full, she did not
evict any doctor, and therefore also no other couple, and we are done. Since there are more positions
than doctors, the probability that the hospital has a vacancy is $1 - \frac{1}{\lambda}$ (for simplicity we assume here
that each hospital has capacity one and the preference distributions are uniform). If the hospital
has no vacancy, she evicts a doctor $d_1$ who enters some hospital $h_1$. If $h_1$ has a vacancy, we are
done. If $h_1$ is full, a doctor $d_2$ gets kicked out, and looks for a new position. Say $d_2$ is assigned
to $h_2$. Again, $h_2$ can have a vacancy, or be full, and so on. However, since at every
step of the chain there is a constant probability for a vacancy, one can show that with probability
$1 - 1/n^3$ the number of hospitals $h, h_1, h_2, \ldots$ in the chain is upper bounded by $3\lambda\log n/(\lambda - 1)$. 
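The chain-length bound can be sanity-checked numerically under a simplified model. The sketch below assumes that each hospital in the chain is full independently with probability at most $1/\lambda$ and that $\log$ denotes the natural logarithm; both are simplifying assumptions made for illustration, and the function names are hypothetical.

```python
import math

def chain_length_bound(lam: float, n: int) -> float:
    """The bound 3 * lam * log(n) / (lam - 1) on the number of hospitals in a chain."""
    return 3 * lam * math.log(n) / (lam - 1)

def tail_probability(lam: float, length: float) -> float:
    """Probability that a chain passes through `length` full hospitals in a row,
    when each hospital is full independently with probability 1/lam."""
    return (1 / lam) ** length

# The chain exceeds the bound with probability at most 1/n^3, because
# lam * log(lam) / (lam - 1) >= 1 for every lam > 1.
for lam in (1.1, 2.0, 5.0):
    for n in (10, 1000, 10**6):
        assert tail_probability(lam, chain_length_bound(lam, n)) <= n ** -3 * (1 + 1e-9)
```

The inequality in the comment is what turns a constant vacancy probability at every step into a logarithmic bound on the chain length.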

Now, we can estimate the probability that the second couple evicts the first. The second couple
kicks out doctors from at most $6\lambda\log n/(\lambda - 1)$ hospitals. If this list includes the hospitals which
admitted the first couple, we could be in trouble. But since preferences are random, the chances
that the second couple influences any of these hospitals are upper bounded by 

$$2 \cdot \frac{6\lambda\log n}{(\lambda - 1)n} = \frac{12\lambda\log n}{(\lambda - 1)n}.$$

What about the third couple? Again, it influences at most $6\lambda\log n/(\lambda - 1)$ hospitals. But now,
there are four hospitals which must not be influenced: two hospitals (at most) for each previously
assigned couple. Generalizing this for the $k$-th couple and summing the probabilities we get 

$$\sum_{k=2}^{m} 2(k-1) \cdot \frac{6\lambda\log n}{(\lambda - 1)n} \leq \frac{6\lambda m^2 \log n}{(\lambda - 1)n}, \qquad m = n^{1/2-\epsilon},$$

which goes to zero as $n$ goes to infinity. Note that if $m = \sqrt{n}$ this argument would not hold. In fact
an argument similar to the Birthday Paradox shows the direct algorithm fails with high probability
if the number of couples is a large multiple of $\sqrt{n}$. 
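To see how this union bound behaves numerically, the sketch below evaluates it under the simplifying assumptions of the argument: the $k$-th couple must avoid at most $2(k-1)$ hospitals, and each of them is hit with probability at most $6\lambda\log n/((\lambda-1)n)$. The function name `direct_failure_bound` is a hypothetical helper, $\log$ is taken to be the natural logarithm, and the number of couples $m = n^{1/2-\epsilon}$ is treated as a real number.

```python
import math

def direct_failure_bound(n: int, eps: float, lam: float) -> float:
    """Union bound on the failure probability of the direct algorithm
    with m = n**(1/2 - eps) couples."""
    m = n ** (0.5 - eps)
    per_hospital = 6 * lam * math.log(n) / ((lam - 1) * n)
    # sum over k = 2..m of 2 * (k - 1) equals m * (m - 1)
    return m * (m - 1) * per_hospital

# For eps > 0 the bound shrinks as n grows, although slowly.
small, medium, large = (direct_failure_bound(n, 0.1, 2.0) for n in (10**6, 10**12, 10**30))

# For eps = 0, i.e. about sqrt(n) couples, the bound grows with n instead.
at_sqrt_small, at_sqrt_large = (direct_failure_bound(n, 0.0, 2.0) for n in (10**6, 10**12))
```

The bound is asymptotic: for moderate $n$ it can exceed 1 and is then uninformative, which is consistent with the argument being only an intuition for large markets.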

2. Number of couples is $n^{2/3-\delta}$: The direct algorithm attempts to insert the couples
according to a single permutation. A natural attempt to find a stable matching when more couples
are in the market is to change the permutation each time a couple kicks out another couple.
Consider the following addition to the direct algorithm: each time a couple $c_i$ evicts a different
couple $c_j$ the algorithm starts over but swaps the order between $c_i$ and $c_j$ when the couples apply. 

Denote the initial order of insertion by $c_1, c_2, \ldots, c_m$. If $c_i$ evicts $c_j$ for $i > j$, swapping places
between $c_i$ and $c_j$ will cause $c_j$ not to be evicted by $c_i$. However, this could create new "evictions".
One can prove that the probability that any other couple "feels" that $c_i$ and $c_j$ have swapped places
in the application order is at most $O(n^{-1/3-\delta/2})$. By a similar analysis as in the direct algorithm,
the probability that any of the doctors who got evicted by $c_i$ or $c_j$ enters any of the hospitals of
these couples is bounded by 

$$\frac{24\, n^{2/3-\delta}\log n}{n} = 24\, n^{-1/3-\delta}\log n,$$

which is $O(n^{-1/3-\delta/2})$. 

What is left to bound is the number of swaps; again, the probability that $c_i$ will evict another
couple is roughly $n^{2/3-\delta}/n = n^{-1/3-\delta}$, where we neglect the $\log n$ factor. Thus the expected number of couples which
will evict another couple is bounded by $n^{2/3-\delta} \cdot n^{-1/3-\delta} = n^{1/3-2\delta} < n^{1/3-\delta}$. Informally, combining these together one
obtains that with probability approaching 1 swapping will solve all "eviction" events, and the algorithm
will find a stable matching and terminate successfully. 

Note that implicitly we assumed here that any pair of couples will swap places at most once.
Unfortunately this approach is not sufficient, and to formally obtain our result we will need
some more subtle structures. 

3. Number of couples is $n^{1-\epsilon}$ (sketch of proof of Theorem 2): 

The SoDA algorithm attempts to find an ordering of the couples, such that if couples apply one
by one according to this order, no couple gets evicted by another couple. Whether or not a couple
$c$ evicts another couple $c'$ depends on the (current) matching and the preference profile. Identifying
worst case scenarios, such as declaring that $c$ could "possibly" evict $c'$ if there exists a configuration in which
this happens, is too weak to prove our result. Instead, we devise a notion of whether $c$ is "likely"
to evict $c'$, and use this notion to analyze the algorithm. To do so we define for each couple $c$ an
influence tree; roughly speaking the influence tree of $c$ consists of the hospitals and doctors which
$c$ is most likely to influence, i.e. the doctors and hospitals that are likely to obtain new matches
due to the presence of $c$ and its likely evictions. 

We will want to show that there are not "many" influence tree intersections, since intersections
imply two couples might be able to influence the same hospital, and more importantly they might
evict each other. A first key step in this direction is the following: 

(i) With high probability each influence tree is small (with respect to $n$). 

If influence trees did not intersect each other, one could show that any insertion order
of the couples yields a stable matching with high probability. Essentially Kojima et al. (2010)
showed that if $\epsilon < 0.5$ then the probability that no two influence trees intersect approaches 1 as
$n \to \infty$. This however is not the case for all $\epsilon < 1$. 

Influence trees, their intersections and hospital preferences induce a useful structure in the form 
of a directed graph which we call the couples graph; roughly speaking, in the couples graph each 
couple is a node, and there is a directed edge from couple c to another couple c' if their influence 
trees intersect at some hospital h and c can possibly evict some doctor that caused h to be in the 
influence tree of c' (the doctor can be a member of the couple c'). We will show that the couples 
graph is sparse: 

(ii) With high probability all weakly connected components in the couples graph are
small.¹⁰ 

Recall that an influence tree for one couple does not involve other couples. In the next step we 
verify that influence trees are indeed the "right" structure: 

(iii) With high probability, if in the algorithm a couple $c$ influences a hospital $h$ under
any ordering $\pi$ over the couples, then that hospital will also belong to the influence
tree of $c$. 



Finally, if one can find a topological sort $\pi$ in the couples graph,¹¹ then letting couples apply
one by one according to $\pi$ yields a stable matching. We show: 

(iv) With high probability there are no directed cycles in the couples graph. 

10 A weakly connected component in a directed graph is a connected component in the graph obtained by removing
the directions of the edges. 

11 A topological sort $\pi$ is an order over the couples such that no couple has an edge to a couple ahead of it in the
order. 
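The role of the topological sort can be illustrated with Python's standard `graphlib` module (available from Python 3.9). The sketch assumes that the couples graph is already given as a list of directed edges `(c, c_prime)`, meaning that `c` can possibly evict `c_prime`; building these edges from influence trees is not shown. The function name `application_order` and its inputs are illustrative assumptions.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Iterable, List, Optional, Tuple

def application_order(couples: Iterable[str],
                      evict_edges: Iterable[Tuple[str, str]]) -> Optional[List[str]]:
    """Return an order in which no couple has an edge to a couple ahead of it,
    or None if the couples graph contains a directed cycle."""
    sorter = TopologicalSorter({c: set() for c in couples})
    for c, c_prime in evict_edges:
        # c may evict c_prime, so c has to apply before c_prime.
        sorter.add(c_prime, c)
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

# Usage: c3 may evict c1 and c1 may evict c2, so c3 applies first.
order = application_order(["c1", "c2", "c3"], [("c3", "c1"), ("c1", "c2")])
```

When the graph has a directed cycle no such order exists, which is why step (iv) of the outline rules out cycles with high probability.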
4.2 Proof of Theorem 2 

We begin with defining influence trees. These will be defined for a fixed realization of the preferences
and with respect to a parameter $r$ which should be interpreted as "possible rejections". First we
need some notation. Let $\Gamma = (H, S, C, \succ_H, \succ_S, \succ_C)$ be a matching market and let $\mu$ be a matching.
Denote by $o_h(\mu)$ and by $f_h(\mu) = k_h - o_h(\mu)$ the number of doctors assigned to hospital $h$ and the
number of available positions in $h$ under $\mu$, respectively. We also denote by $d^j(\mu, h)$ the $j$-th
least preferred doctor according to $\succ_h$ that is assigned to $h$ under $\mu$. 

Definition 3 (Influence Tree) Let T = {H,S,C,>~h,>~s^c) be matching market with couples 
and let \x be the matching produced by the DA algorithm for the sub-market without couples. Let 
d G D and let r be any integer. An influence sub-tree of doctor d with root h and with respect to 
r , denoted by LT(d, r, h) is defined recursively as follows. 

(a) If = and d h) >~h d then let h! be be the next preferred hospital by d after h and 
let IT(d, r, h) = IT(d, r, h'). Otherwise 

(b) Change fi such that d is assigned to h and: 
(bl) Add (h,d) to IT(d,r,h). 

(b2) If r > or fh(fi) = —1 then: for each j = 1, . . . , min(o/ l (//), r — fh(fJ-)) let hj be the 
most preferred hospital by d 3 (fi, h) after h, and add to IT(d, r, h) the influence sub-tree 
IT(di(ji,h),r-i3-l)-f h (jj,),hj). 

For a couple c = {/, m} , let (hj, h^), . . . , (h r p h T m ) be the top r pairs of hospitals according to >- c 
in which the couple c can be accepted. That is, either 

• Kf = h' l m and c C Ch h y(fj,(hy) Uc), or 

. h) + hi and f G Ch h y(ji(fy U {/}) and m G Ch^J^) U {m}). 
The influence tree for the couple c is defined to be: 

r 

IT(c, r) := [J (lT(f, r + l-i, h))) U IT(m, r + l-i, h % m )) . 
i=l 
First note that we allow $f_h(\mu)$ to be -1 in the definition of an influence tree (this is possible since
under this definition we first assign a doctor to a hospital and only then reject from that hospital).
Also observe that each time a hospital h is inserted into the influence tree, a doctor d is associated
with it. In this case we say that h was inserted into IT(c, r) by d.¹² With a slight abuse of notation
we will write $h \in IT(c, r)$ if there exists a doctor d such that $(h, d) \in IT(c, r)$, i.e. h was
inserted into IT(c, r) by some doctor.

In the definition of an influence tree for c, no couple other than c is involved; the definition
in fact simulates the presence of other couples, or in other words it simulates an adversary that can
"reject" doctors from settling in a hospital h because of the additional positions that may be
occupied due to the presence of other couples. The adversary is allowed to reject r
times (above the natural rejections). Importantly, Definition 3 allows us to analyze a static setting
rather than a dynamic setting in which at each point a different number of couples have already applied.

Before we continue with the proof we illustrate the definition of an influence tree in the following 
example. 

Example 1 Consider a setting with 6 hospitals, each with capacity 2, 5 single doctors $d_1, d_2, \ldots, d_5$,
and two couples $c_1 = (d_6, d_7)$ and $c_2 = (d_8, d_9)$, and let their preferences be as in Table 1. To
simplify the illustration we chose preferences that do not "seem" to be drawn randomly.

The Deferred Acceptance algorithm for the market without couples produces the matching given
in the boxes in Table 1. The influence trees of $c_1 = (d_6, d_7)$ with parameters r = 0 and r = 1 are
given in Figure 1(a). For r = 0 the tree captures the "chain reaction" that $c_1$ causes after entering
the first pair of hospitals that accepts it, namely the pair $(h_3, h_4)$. For r = 1, note that had
$c_1$ been rejected from the pair $(h_3, h_4)$, the next pair that would have accepted it would
be $(h_1, h_5)$. Thus the influence tree of $c_1$ with r = 1 includes both its tree for r = 0 and the
chain reaction it would cause had it been accepted to $(h_1, h_5)$ (see Figure 1(b)). Similarly, the
influence tree of couple $c_2 = (d_8, d_9)$ is given in Figure 1.

At this point we fix r to be $r = 4/\varepsilon$ for some fixed $0 < \varepsilon < 1$. One should interpret this r as a
"small" number of possible rejections (relative to n). In a random market the influence trees (ITs)
are random variables.



12 We do not rule out here that h was inserted into the influence tree by two different doctors. We will later show
that the probability of this event is negligible, however.
Table 1: Preference lists.

Lemma 3 1. For every hospital h and couple c, $\Pr(h \in IT(c, r)) = O\left((\log n)^{r+1}/n\right)$.

2. The probability that the size of every influence tree IT(c, r) is $O((\log n)^{r+1})$ is at least $1 - n^{-3}$.

3. The probability that for all couples c, each hospital h appears in IT(c, r) at most once is at
least $1 - n^{-\varepsilon/2}$.

Proof: We begin with the second part. Let c be a couple. For each of the two $d \in c$ and for each
$h' \ne h$ we will give an upper bound of $O((\log n)^{r}/n)$ on $\Pr(h \in IT(d, r, h'))$. The claim will follow
from the definition of IT(c, r) and a union bound.



Figure 1: Influence trees with parameters r = 0 and r = 1. (a) Influence tree of $c_1 = (d_6, d_7)$.
(b) Influence tree of $c_2 = (d_8, d_9)$.
An alternative way of viewing the recursive definition of IT(d, r, h') is as follows: doctor d
proceeds down his list beginning with h' until he finds the first hospital willing to accept him. If
d is accepted into a hospital $h_1$ and $h_1$ was full to capacity, then some doctor d' is evicted and
goes to a hospital $h_2$, and we add $IT(d', r, h_2)$ to IT(d, r, h'). In this case, continuing the "chain
reaction" did not require any arbitrary rejections. We call the hospitals added into IT(d, r, h') with
parameter r the main path of IT(d, r, h'). We then also allow the adversary to introduce up to
r arbitrary rejections (for example, precluding d from being accepted into $h_1$). Thus the influence
tree is composed of the main path, with lower-order influence trees (i.e. influence trees with a
strictly smaller value of r) attached along it.
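The main path described above (the chain reaction with no adversarial rejections) can be sketched in Python. This is only an illustrative sketch: the function name `main_path` and the data layout (`prefs`, `rank`, `capacity`, `assigned`) are assumptions made for this example and do not come from the paper, and the lower-order sub-trees attached along the path are deliberately omitted.

```python
def main_path(d, prefs, rank, capacity, assigned, start=0):
    """Trace the eviction chain started when doctor d applies down its list.

    Hypothetical data layout (not from the paper):
      prefs[x]    : list of hospitals in doctor x's order of preference
      rank[h][x]  : position of doctor x in hospital h's ranking (lower = better)
      capacity[h] : number of positions at hospital h
      assigned[h] : set of doctors currently at h (updated in place)
    Returns the list of (hospital, doctor) pairs added along the main path.
    """
    path = []
    current, idx = d, start
    while True:
        evicted = None
        while idx < len(prefs[current]):
            h = prefs[current][idx]
            occupants = assigned[h]
            if len(occupants) < capacity[h]:
                # Free position: the chain reaction stops here.
                occupants.add(current)
                path.append((h, current))
                break
            worst = max(occupants, key=lambda x: rank[h][x])
            if rank[h][current] < rank[h][worst]:
                # h is full but prefers `current`: evict its least preferred doctor.
                occupants.remove(worst)
                occupants.add(current)
                path.append((h, current))
                evicted = worst
                break
            idx += 1  # rejected by h, try the next hospital on the list
        if evicted is None:
            return path
        # The evicted doctor continues from the hospital right after h on its own list.
        current, idx = evicted, prefs[evicted].index(h) + 1
```

Each pair appended to `path` plays the role of a node (h, d) of the influence tree with r = 0; handling r > 0 would additionally require branching on the adversary's rejections.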

We first show by induction that with probability at least $1 - n^{-6}$ the length of the main path
in IT(d, r, h') is at most $b \log n$, where $b = 6 \cdot c_{\max} \cdot \gamma_{\max}$. At any step along the main path, for the
main path to continue, the currently evicted doctor d needs to choose a full hospital h. Because
of the way the doctors' preferences are sampled, the probability of this happening is bounded by
$1 - \frac{1}{c_{\max} \cdot \gamma_{\max}}$. Since each subsequent step along the path is independent of the previous ones,
the bound follows.

By a union bound, we see that with probability at least $1 - n^{-4}$ all potential main paths contain at
most $b \log n$ hospitals. Each main path of length $\ell$ recursively gives rise to at most $r \cdot \ell$ lower-order
influence trees (i.e. influence trees with smaller r) that are added to IT(d, r, h'). Thus we can
prove by induction that for each r, the size S(r) of the largest order-r influence tree is bounded by
$(1 + br \log n)^{r+1} = O((\log n)^{r+1})$. For the base case, an influence tree with r = 0 only contains the
main path, and thus $S(0) \le b \log n$. For the step, we get

\[
S(r) \le b \log n + (b \log n) \cdot r \cdot S(r - 1) \le b \log n + (b \log n) \cdot r \cdot (1 + br \log n)^{r}
\le (1 + br \log n)^{r} + (b \log n) \cdot r \cdot (1 + br \log n)^{r} = (1 + br \log n)^{r+1}.
\]
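As a sanity check on the induction above, the recursion can be evaluated numerically and compared with the closed form. The sketch below is an illustration under stated assumptions: `L` stands for the main-path length bound b log n, the helper names are hypothetical, and the recursion is taken with equality (the worst case). The comparison starts at r = 1, since for r = 0 the closed form evaluates to 1 and the bound S(0) ≤ b log n is used directly.

```python
def tree_size_recursion(r, L):
    """Worst-case tree size from S(0) = L and S(k) = L + L * k * S(k - 1)."""
    size = L
    for k in range(1, r + 1):
        size = L + L * k * size
    return size


def closed_form_bound(r, L):
    """The bound (1 + r * L) ** (r + 1) claimed by the induction."""
    return (1 + r * L) ** (r + 1)


if __name__ == "__main__":
    L = 10  # assumed stand-in for b * log(n)
    for r in range(1, 6):
        print(r, tree_size_recursion(r, L), closed_form_bound(r, L))
```

For every r ≥ 1 the first printed value should not exceed the second, which is exactly what the induction step asserts.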

Next, the first part of the lemma follows from the proof of the second part and the fact that
the hospitals that are added to IT(c, r) are hospitals on the doctors' preference lists and are chosen
independently. Thus the probability that h is added to IT(c, r) at some point is bounded by
$S(r) \cdot (c_{\max} \cdot \gamma_{\max}/n) = O((\log n)^{r+1}/n)$.

Finally, we show that IT(c, r) does not "intersect itself" except with probability $< n^{-\varepsilon/2}$. Note
that in particular this means that the members of the couple may not apply to the same hospital
or evict each other. We have seen that the probability of a hospital h belonging to IT(c, r) is
bounded by $O(S(r)/n)$. Similarly, the probability that h is added twice or more to IT(c, r) is
bounded by $O(S(r)^2/n^2)$. Taking a union bound over all possible hospitals h and all possible
couples c, we see that the probability that any hospital appears in any IT(c, r) twice or more is
bounded by

\[
O\left(S(r)^2/n^2\right) \cdot n \cdot n^{1-\varepsilon} < n^{-\varepsilon/2}.
\]

□ 
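For convenience, the last union bound can be unpacked as follows. This is a sketch that uses only $S(r) = O((\log n)^{r+1})$ from the second part of the lemma and treats r as a constant:

```latex
\[
O\!\left(\frac{S(r)^2}{n^2}\right)\cdot n \cdot n^{1-\varepsilon}
  = O\!\left(\frac{S(r)^2}{n^{\varepsilon}}\right)
  = O\!\left(\frac{(\log n)^{2r+2}}{n^{\varepsilon}}\right)
  < n^{-\varepsilon/2}
  \quad \text{for all sufficiently large } n,
\]
```

where the last step holds because $(\log n)^{2r+2} = o(n^{\varepsilon/2})$ for any constant r and any fixed $\varepsilon > 0$.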

Throughout the remainder of the proof we will assume that each hospital appears in each
IT(c, r) at most once, neglecting an event of probability $< n^{-\varepsilon/2}$.

In fact, in Lemma 3 one can prove a stronger bound of $O(\log n/n)$ for the probability that a
hospital belongs to an influence tree. Although we do not prove or use the stronger bound in the
rest of the paper, it provides intuition for why the SoDA algorithm works well even in a rather
small market (e.g. when n = 256 we have $(\log 256)^3 = 8^3 = 512$, which does not explain why the
algorithm works).

Next we analyze how much influence trees intersect with each other. Let $c_1$ and $c_2$ be two
different couples. We say that two influence trees $IT(c_1, r)$ and $IT(c_2, r)$ intersect at hospital h if
there exist d' and d'' such that $d' \ne d''$, $(h, d') \in IT(c_1, r)$ and $(h, d'') \in IT(c_2, r)$.¹³

Lemma 4 No two influence trees intersect more than once, except with probability $< n^{-\varepsilon/2}$.

Proof: By Lemma 3 we can assume that for every couple c the size of IT(c, r) is at most
$O((\log n)^{r+1})$. For the remainder of the proof, we will denote this upper bound on the size of
IT(c, r) by $S(r) = O((\log n)^{r+1})$. Recall also that we have assumed that no IT(c, r) intersects
itself.

We prove that with high probability no two influence trees intersect exactly 2 times. A similar
proof shows that for every $3 \le k \le S(r)$ no two influence trees intersect exactly k times. The proof
will then follow by a union bound on k (since the size of each tree is at most S(r) with high probability,
they cannot intersect more than S(r) times).

13 It is possible that if two influence trees intersect they will have other nodes (h, d) in common, since there might 
be common paths that continue from the point they intersect. 
Let $c_1, c_2$ be two couples, and $h_1, h_2$ be two hospitals. We want to bound the probability of the
event

\[
\Pr(h_1, h_2 \in IT(c_1, r) \cap IT(c_2, r)) = \Pr(h_1, h_2 \in IT(c_1, r)) \cdot \Pr(h_1, h_2 \in IT(c_2, r) \mid h_1, h_2 \in IT(c_1, r)). \tag{1}
\]

We first note that if $h_1$ is an ancestor of $h_2$ in, e.g., $IT(c_1, r)$, and $IT(c_1, r)$ intersects $IT(c_2, r)$ in
both $h_1$ and $h_2$, then the influence tree $IT(c_2, 2r + c_{\max})$ will self-intersect at $h_2$. The hospital $h_2$
will be added to $IT(c_2, 2r + c_{\max})$ twice: once following the path in $IT(c_2, r)$, and a second time
through $h_1$ and then following the path from $h_1$ to $h_2$ in $IT(c_1, r)$. Since $2r + c_{\max}$ is a constant,
by Lemma 3 the probability that any $IT(c, 2r + c_{\max})$ will self-intersect is smaller than $n^{-\varepsilon/2}$, and
can be disregarded. Thus we can assume that $h_1$ and $h_2$ are not each other's ancestors in either
$IT(c_1, r)$ or $IT(c_2, r)$.

We begin by calculating the probability of the first event in (1). A proof similar to that of
Lemma 3 gives that the probability of this event is

\[
\Pr(h_1, h_2 \in IT(c_1, r)) = O\left(\frac{S(r)^2}{n^2}\right).
\]

Rather than compute $\Pr(h_1, h_2 \in IT(c_2, r) \mid h_1, h_2 \in IT(c_1, r))$ directly, to avoid the conditioning,
we consider inserting $c_2$ into a modified world, in which all hospitals in $IT(c_1, r)$ except for
$\{h_1, h_2\}$ and all the doctors in these hospitals do not exist. We argue that in this case,

\[
\Pr(h_1, h_2 \in IT(c_2, r)) = O\left(\frac{S(r)^2}{n^2}\right),
\]

using similar reasoning.

The influence tree generated in the modified algorithm (where we took out some of the hospitals)
may differ from the one in the "real" algorithm. Note however that if removing $IT(c_1, r)$ affects the
generation of the tree $IT(c_2, r)$ before it reaches $h_1, h_2$, then it is the case that $IT(c_2, r)$ intersects
$IT(c_1, r)$ at another hospital (which comes before $h_1, h_2$). But this is a contradiction, since we
assumed $IT(c_1, r)$ and $IT(c_2, r)$ intersect exactly twice.

Multiplying the probabilities, we get that

\[
\Pr(h_1, h_2 \in IT(c_1, r) \cap IT(c_2, r)) = O\left(\frac{S(r)^4}{n^4}\right).
\]

Taking a union bound over the O(n) hospitals and the $n^{1-\varepsilon}$ couples, the probability that there exist two
couples whose influence trees intersect exactly twice is at most

\[
O\left(\frac{S(r)^4}{n^4}\right) \cdot n^2 \cdot n^{2(1-\varepsilon)} = O\left(\frac{S(r)^4}{n^{2\varepsilon}}\right).
\]

We do not present the proof for exactly k intersections, and only state that the probability of that
event drops at a rate of

\[
\frac{S(r)^{2k}}{n^{k\varepsilon}} \le \frac{S(r)^4}{n^{2\varepsilon}}.
\]

Taking a union bound over all possible values of k, we get that the probability that any two couples
intersect strictly more than once is at most

\[
O\left(S(r) \cdot \frac{S(r)^4}{n^{2\varepsilon}}\right) = \frac{\mathrm{polylog}(n)}{n^{2\varepsilon}},
\]

as required.¹⁴ □

Observe that in the definition of an influence tree for a couple c, no other couple is involved and
therefore the tree captures only what possibly could have happened had there been other couples.
The SoDA algorithm inserts couples one by one after the DA algorithm has terminated, and if some
couple $c_1$ evicts another couple $c_2$ the order of their insertions is altered so that $c_1$ is moved ahead
of $c_2$. Intuitively the intersection of two influence trees, of $c_1$ and of $c_2$, together with the hospital
preferences will provide a good guess as to which couple to insert first. This motivates the following
definition of the couples graph:

Definition 4 Let $\Gamma = (H, S, C, \succ_H, \succ_S, \succ_C)$ be a matching market and let r > 0. In a (directed)
couples graph for depth r > 0, denoted by G(C, r), the set of vertices is C and for every two
couples $c_1, c_2 \in C$ there is a directed edge from $c_1$ to $c_2$ if and only if there exist $h \in H$ and
$d_1, d_2 \in D$ ($d_1 \ne d_2$) such that $(h, d_1) \in IT(c_1, r)$ and $(h, d_2) \in IT(c_2, r)$ and $d_1 \succ_h d_2$.
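The edge rule of this definition can be sketched in Python as follows. The sketch assumes the influence trees have already been computed and are given as sets of (hospital, doctor) pairs; the function name `couples_graph` and the layout of `trees` and `rank` are hypothetical choices for illustration, and hospital preferences are assumed to be strict.

```python
from itertools import combinations


def couples_graph(trees, rank):
    """Build the directed couples graph from precomputed influence trees.

    Hypothetical data layout (not from the paper):
      trees[c]   : set of (hospital, doctor) pairs in the influence tree of couple c
      rank[h][d] : position of doctor d in hospital h's ranking (lower = better)
    Returns a set of directed edges (c1, c2), meaning c1 points to c2.
    """
    edges = set()
    for c1, c2 in combinations(trees, 2):
        for h1, d1 in trees[c1]:
            for h2, d2 in trees[c2]:
                if h1 != h2 or d1 == d2:
                    continue  # the trees meet only at a shared hospital with distinct doctors
                if rank[h1][d1] < rank[h1][d2]:
                    edges.add((c1, c2))  # h prefers c1's doctor: edge from c1 to c2
                else:
                    edges.add((c2, c1))  # h prefers c2's doctor: edge from c2 to c1
    return edges
```

The nested loops are quadratic in the tree sizes, which is harmless here because the trees are of polylogarithmic size with high probability.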

Before we continue we illustrate a couples graph. 

Example 2 Consider the same market as in Example 1 (see Table 1). Note that the influence
trees with r = 1 intersect in $h_3$, where $(h_3, d_6) \in IT(c_1, 1)$ and $(h_3, d_2) \in IT(c_2, 1)$. Since $d_6 \succ_{h_3} d_2$,
the couples graph with r = 1 is as in Figure 2. Indeed, letting $c_1$ apply before $c_2$ (after the DA stage)
will end without any couple evicting another, and in a stable matching.
14 We write polylogn for a polynomial in logn. In particular polylog " 
Figure 2: Couples graph for r = 1.



Our goal will be to show that with high probability the graph G(C, r) can be topologically sorted;
such a sorting corresponds to a "good" insertion order of the couples in the SoDA algorithm. In
Example 2 the order $c_1, c_2$ is a topological sort.
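A topological sort of a couples graph can be obtained with Python's standard `graphlib` module (Python 3.9 or later). The sketch below is illustrative: the function name `insertion_order` and the edge representation are assumptions, and an edge (a, b) is read as "a should be inserted before b".

```python
from graphlib import CycleError, TopologicalSorter


def insertion_order(couples, edges):
    """Return an insertion order consistent with the edges, or None if there is a cycle."""
    sorter = TopologicalSorter()
    for c in couples:
        sorter.add(c)  # register isolated couples as well
    for before, after in edges:
        sorter.add(after, before)  # `before` is a predecessor of `after`
    try:
        return list(sorter.static_order())
    except CycleError:
        return None  # a directed cycle: counted as a failure in the analysis
```

For the graph of Example 2, with a single edge from c1 to c2, this returns the order c1, c2.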

In a couples graph G = G(C, r) a weakly connected component is defined to be a connected
component in the graph obtained from G by removing the direction of the edges.¹⁵
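Weakly connected components can be computed by ignoring edge directions and merging endpoints with a union-find structure. The following is a minimal sketch; the function name `weak_components` and the input layout are hypothetical.

```python
def weak_components(couples, edges):
    """Group couples into weakly connected components (edge directions are ignored)."""
    parent = {c: c for c in couples}

    def find(x):
        # Follow parent pointers to the representative, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)  # merge the two endpoints' components

    groups = {}
    for c in couples:
        groups.setdefault(find(c), set()).add(c)
    return list(groups.values())
```

The size of the largest returned set is the quantity that Lemma 5 bounds.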



Lemma 5 With probability $> 1 - 1/n$ the largest weakly connected component of the couples graph
has size at most $3/\varepsilon$.

Proof: We will first consider an arbitrary set of $3/\varepsilon$ couples and show that the probability that
they form a weakly connected component is very small. The statement of the lemma will follow
through a union bound. Let $I = (c_1, c_2, \ldots, c_{\lfloor 3/\varepsilon \rfloor})$ be a sequence of couples with no repetitions:
$c_i \ne c_j$. Let $A_I$ be the event that for every $1 < i \le \lfloor 3/\varepsilon \rfloor$ the influence tree of $c_i$ intersects
one of the previous influence trees, that is

\[
IT(c_i, r) \cap \Big( \bigcup_{j < i} IT(c_j, r) \Big) \ne \emptyset.
\]

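We first show that

\[
\Pr(A_I) \le \frac{\left(S(r)^2 \cdot c_{\max} \cdot \gamma_{\max} \cdot 3/\varepsilon\right)^{\lfloor 3/\varepsilon \rfloor}}{n^{\lfloor 3/\varepsilon \rfloor - 1}}
\le \frac{\left(S(r)^2 \cdot c_{\max} \cdot \gamma_{\max} \cdot 3/\varepsilon\right)^{3/\varepsilon}}{n^{3/\varepsilon - 2}}, \tag{2}
\]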

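where S(r) is the bound on the size of the influence trees $IT(c_i, r)$ as in Lemma 3.
Let

\[
IT_i = \bigcup_{j \le i} IT(c_j, r)
\]

be the union of the influence trees of the first i couples. The probability of $A_I$ can be written as

\[
\Pr(A_I) = \Pr\left(IT(c_2, r) \cap IT_1 \ne \emptyset\right) \cdot \Pr\left(IT(c_3, r) \cap IT_2 \ne \emptyset \mid IT(c_2, r) \cap IT_1 \ne \emptyset\right) \cdots
\Pr\left(IT(c_{\lfloor 3/\varepsilon \rfloor}, r) \cap IT_{\lfloor 3/\varepsilon \rfloor - 1} \ne \emptyset \mid \forall j \le \lfloor 3/\varepsilon \rfloor - 1,\ IT(c_j, r) \cap IT_{j-1} \ne \emptyset\right). \tag{3}
\]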

15 A set of nodes in an undirected graph is called a connected component if there exists a path between each two
nodes in the set.
All the interactions that cause the influence trees within $IT_{j-1}$ to intersect happen within $IT_{j-1}$,
and conditioned on the set $IT_{j-1}$ of hospitals do not affect the probability of $IT(c_j, r)$ intersecting
$IT_{j-1}$. Hence for every $j = 2, \ldots, \lfloor 3/\varepsilon \rfloor$,

\[
\Pr\left(IT(c_j, r) \cap IT_{j-1} \ne \emptyset \mid \forall\, 2 \le l \le j - 1,\ IT(c_l, r) \cap IT_{l-1} \ne \emptyset\right) =
\Pr\left(IT(c_j, r) \cap IT_{j-1} \ne \emptyset \mid IT_{j-1}\right).
\]

Furthermore, from Lemma 3 it follows that the probability that $|IT(c_i, r)| \le S(r)$ is at least
$1 - \frac{1}{n^3}$, and therefore $|IT_j| \le j \cdot S(r)$. Hence,

Pr (JT(ty,r) D IT^ X ± | IT^) < — - + ~ z < . 

Since there are $\lfloor 3/\varepsilon \rfloor - 1$ terms in (3), we derive the bound on $\Pr(A_I)$ claimed above.

To finish the proof, observe that if there is a connected component of size at least $3/\varepsilon$ then
there exists a sequence I such that $A_I$ holds. Since there are $n^{1-\varepsilon}$ couples, there exist fewer than

\[
\left(n^{1-\varepsilon}\right)^{3/\varepsilon} = n^{3/\varepsilon - 3}
\]

such possible sequences I. Therefore using a union bound over all of them proves the lemma. □

Recall that we ignore all realizations of preferences at which two influence trees intersect more
than once (in particular there is at most a single edge between every two couples in the couples
graph). From now on we also ignore realizations where the largest weakly connected component
of the couples graph contains more than $3/\varepsilon$ couples.

Lemma 6 With probability 1 — O (^) the couples graph has no directed cycles. 

Proof: We first prove the following claim, that is basically a simple general statement about 
directed graphs: 

Claim 1 If the shortest directed cycle has length k, it involves k different hospitals. 

Proof: Suppose the shortest directed cycle is of length k and consider such a cycle
$c_1 \to c_2 \to \cdots \to c_k \to c_1$. Suppose couples $c_1$ and $c_2$ intersect at h due to $d_1$ and $d_2$ respectively,
i.e. $(h, d_1) \in IT(c_1, r)$, $(h, d_2) \in IT(c_2, r)$ and $d_1 \succ_h d_2$. Assume for contradiction
that for some $2 \le i \le k$, $c_i$ and $c_{i+1}$ (i is taken modulo k) intersect at hospital h due to some
doctors $d_i$ and $d_{i+1}$, i.e. $(h, d_i) \in IT(c_i, r)$, $(h, d_{i+1}) \in IT(c_{i+1}, r)$ and $d_i \succ_h d_{i+1}$. Consider
the case in which $d_i \succ_h d_2$. In this case a cycle of length less than k exists, which consists of
$c_2 \to c_3 \to \cdots \to c_i \to c_2$. If $d_2 \succeq_h d_i$, i.e. either $d_2 \succ_h d_i$ or $d_2 = d_i$, then $d_1 \succ_h d_2 \succeq_h d_i \succ_h d_{i+1}$,
implying that $c_1 \to c_{i+1} \to \cdots \to c_k \to c_1$ is a shorter cycle. □

To prove the lemma it is sufficient to show that the probability that the shortest directed cycle 
has length k is O ^^2= ) since by taking the sum of these probabilities over all values of k gives 
the result (note that the dominant term in this sum is the one for k = 2).

We proceed in a manner similar to that of the proof of Lemma 5. Let $I = (c_1, c_2, \ldots, c_k)$ be
a sequence of couples without repetitions, $c_i \ne c_j$. Let $J = (h_1, h_2, \ldots, h_k)$ be a sequence of k
hospitals without repetitions, $h_i \ne h_j$. Let $A_{I,J}$ be the event that for every $i = 1, \ldots, k$, $IT(c_i, r)$
and $IT(c_{i+1}, r)$ intersect at hospital $h_i$. Applying Lemma 3 and using reasoning similar to the
proof of Lemma 5, the probability of the event $A_{I,J}$ can be bounded by

(25(r)- 7 „ ^ 2k 



Pr(A7,j) < 



Imax ) 

\2k 
^max I 



(An/c 

Since there are at most $\lambda n$ positions and $n^{1-\varepsilon}$ couples, there are at most $\lambda^k n^k n^{(1-\varepsilon)k}$ such different events $A_{I,J}$.
A union bound over all these events implies the lemma. □

For the analysis we will consider the event that the couples graph contains a cycle as a failure.¹⁶
If the couples graph does not have cycles, then it has a topological sort. Let $\pi$ denote any topological
sort of G. We claim that inserting the couples according to $\pi$ will result in a stable matching with
couples. Moreover, we will show that a failure of the SoDA algorithm corresponds to a backward
edge in the couples graph.¹⁷

The next lemma shows that the influence trees indeed capture "real influences".

Lemma 7 Suppose we insert the couples as in the SoDA algorithm according to some order $\pi$ until
a couple evicts another couple or until all couples have been inserted. If a couple c is inserted and
influences hospital h, then $h \in IT(c, r)$.

Proof: Recall that we consider only "small" weakly connected components (Lemma [5] upper 
bounded the probability that such a component is large). Let c be the couple currently being 
inserted, and assume that the statement of the lemma was true for couples inserted before c. 



16 The presence of a cycle does not necessarily imply that there is no stable matching. In fact the SoDA will often 

find stable matchings even when there are cycles in the couples graph. 

17 A backward edge is an edge from a newly inserted couple to a previously inserted one. 
Let $\{c_1, \ldots, c_k\}$ be c's weakly connected component in the couples graph, where $k \le 3/\varepsilon$, ordered
according to their insertion order in $\pi$. We prove by induction a stronger claim, namely that if
$c = c_i$ influenced a hospital h, then $h \in IT(c, i - 1)$.

Suppose that $c = c_i$ is currently being inserted and that its insertion affects a hospital h.
Consider the path of evictions that was started by c and led to hospital h being affected. There
are two types of evictions along this path: the first type would have occurred even without any
other couples present. The second type occurs because a hospital h' on the path has already been
affected by a previously inserted couple $c_j$. If this happens, then the influence tree of c intersects
the influence tree of $c_j$ and thus $c_j$ is in the weakly connected component of c in the couples graph.
Moreover, since influence trees intersect only once, evictions due to influences from previously
inserted couples happen at most i - 1 times: at most once for each previously inserted couple in
the weakly connected component of c. By the definition of $IT(c, i - 1)$ this implies $h \in IT(c, i - 1)$.

□ 

As an immediate corollary of Lemma [7] we obtain that a couple causing another couple to be 
evicted corresponds to an edge in the couples graph. 

Corollary 8 If in an insertion order $\pi$ inserting the couple $c_{\pi(i)}$ causes the couple $c_{\pi(j)}$ to be evicted
(j < i), then in the couples graph there is an edge from $c_{\pi(i)}$ to $c_{\pi(j)}$.

Since a topological sort exists with high probability, Theorem 2 follows from the following
corollary:

Corollary 9 Inserting the couples according to any topological sort $\pi$ of the couples graph gives
a stable outcome.

Finally, we can now analyze the running time of (a slight modification of) the SoDA algorithm.
Note that with high probability the couples graph has small connected components
(of size at most $3/\varepsilon$) and can be topologically sorted. According to Corollary 9, each failed iteration of
the SoDA algorithm is due to a backward edge in the insertion order $\pi$. By recording the backward
edge, and ensuring that all future attempts are consistent with it, we can guarantee that at most
$(3/\varepsilon)^2 \cdot n^{1-\varepsilon}$ permutations will be tried before either a topologically sorted order is arrived at, or a
cycle in the couples graph is found.¹⁸

18 It can be shown that the SoDA algorithm without this modification will run for at most $(3/\varepsilon)^{3/\varepsilon} \cdot n^{1-\varepsilon}$ iterations.
5 Incentive Compatibility



In this section we will show that: 

Theorem 10 Ex post truthfulness: The probability that any doctor can gain by misreporting her
preferences is at most $O(n^{-\varepsilon/2})$, even if the doctor knows the entire preference list.

A similar result can be shown for hospitals, using techniques similar to those in the proof of Theorem 10.
We omit the exact details here.¹⁹ Together with Theorem 10 we obtain that reporting truthfully
is a $\delta$-Bayes Nash equilibrium in the Bayesian game induced by the SoDA algorithm (assuming
bounded utilities). We refer the reader to Kojima et al. (2010) for the exact definitions of the
Bayesian game.

Throughout this section we will use the same assumptions as in the previous section about the
influence trees. They hold except with probability $O(n^{-\varepsilon/2})$. Informally, we will show that if a
doctor or a couple does not interact with any other couple's influence tree, then she does not have
an incentive to deviate. To this end we show:

Lemma 11 Let $d \in S$ be any doctor. Suppose that the SoDA algorithm terminates and assigns d
to a hospital h in the first (Deferred Acceptance) stage of the algorithm. Suppose that h does not
belong to any of the couples' influence trees. Then d may not improve her allocation under SoDA
by misrepresenting her preferences.

Similarly, if $c \in C$ is a couple whose influence tree is disjoint from all other influence trees,
then c may not improve their allocation under SoDA by misrepresenting their preferences.

Proof: We start with the first statement. At the end of the execution of the first stage of the
SoDA algorithm d ends up in h. By Lemma 7, if d was moved from h in the second stage, then
h must belong to the influence tree of one of the couples, contradicting the assumption. Hence at
the end of the SoDA algorithm d is still assigned to the hospital h.

Suppose that d misrepresents her preferences and obtains a hospital h' such that $h' \succ_d h$ in a
valid execution of the SoDA algorithm. It is well known that the outcome of the (regular) Deferred
Acceptance algorithm on singles does not depend on the insertion order. Hence we can execute the


19 In particular one will need to define influence trees for hospitals, show that with high probability a hospital does
not encounter any couple, and with a bit of effort apply Lemma 10 in Kojima and Pathak (2009), which asserts the
desired result for hospitals in markets without couples.
SoDA algorithm so that d is the last single doctor to be inserted. Just before d is inserted, for all
doctors d' that are assigned to h', $d' \succ_{h'} d$; otherwise d would have been assigned to h' when stating
her true preferences. From that point on, a valid execution of the SoDA algorithm does not lead
to any couples being evicted, and hence the quality of the least preferred doctor in h' according
to $\succ_{h'}$ may only improve. Hence d may not be assigned to h' in the second phase of the SoDA
algorithm. Contradiction.

Next, let $c = (f, m)$ be a couple such that IT(c, r) is disjoint from all other influence trees.
Suppose that c is assigned the hospitals $(h_1, h_2)$ in a valid execution of the SoDA algorithm with
an ordering $\pi$ on couples. Since IT(c, r) is disjoint from other influence trees, by Lemma 7 we see
that inserting the couples in the order $\pi'$ obtained from $\pi$ by putting c first leads to another valid
execution that results in the same allocation.

Suppose that c misrepresent their preferences and obtain the hospitals $(h'_1, h'_2) \succ_c (h_1, h_2)$ in a
valid execution of the SoDA algorithm. Note that the couple c was the first to be inserted under
$\pi'$ and did not get accepted into $(h'_1, h'_2)$ because one of the hospitals preferred all the doctors that
were assigned to it in the DA stage of the algorithm to the corresponding couple member. Without
loss of generality, assume that $h'_1$ preferred all of its assigned doctors to f. As in the single doctor
case above, in the second phase of the SoDA algorithm the least preferred doctor according to $\succ_{h'_1}$
that is assigned to $h'_1$ may only improve. Thus f may never be assigned to $h'_1$. Contradiction. □



Using Lemma 11 we can now prove Theorem 10.



Proof: (of Theorem 10). Fix any doctor $d \in S$ and the hospital h it is assigned in the DA
stage of the SoDA algorithm. By an argument very similar to Lemma 3 we can show that the
probability that any influence tree contains h (or any other hospital in the influence tree of d) is
bounded by $O(S(r)^2/n^{\varepsilon}) < n^{-\varepsilon/2}$. By Lemma 11, if this is the case, d does not have an incentive
to deviate.

Similarly, the probability of the influence trees of two couples intersecting is bounded by
$O(S(r)^2/n)$, and thus for each couple c, the probability that IT(c, r) is disjoint from all other
influence trees, so that c has no incentive to deviate, is at least $1 - O(S(r)^2/n^{\varepsilon}) > 1 - O(n^{-\varepsilon/2})$.

□ 
6 Simulations



In this section we provide simulation results using the SoDA algorithm. In particular we performed
a sensitivity analysis on various parameters of the problem. For each configuration we ran 600 trials.
We assumed there are ^ hospitals where n is the number of singles and each hospital has capacity 
of 3J 
In the first simulation we fixed the percentage of couples in the market and measured the success
rate of finding a stable matching. For comparison, in the NRMP match in 2010 the number of (U.S.)
doctors was about 16,000, whereas the number of couples was about 800.²¹ As Figure 3 shows,
the ratio of doctors that are members of couples plays a crucial role in the probability that a stable
match will be found. Note that although the number of singles grows (and the number of couples
is linear), the probability of finding a stable match appears to remain unchanged.




Figure 3: The success rate for finding a stable outcome given the number of singles (x-axis), for 
different couples percentages (5% means that 10% of the doctors are members of couples). 



Next we fixed $\varepsilon$, i.e. the number of couples is $n^{1-\varepsilon}$. Figure 4 shows that the probability
of finding a stable match with SoDA increases and is roughly concave in the number of singles.
Observe that the rate of convergence differs across values of $\varepsilon$.

In the next simulation (see Figure 5) we fixed the number of singles and the number of couples

20 The results can be slightly improved by randomizing a new insertion order each time the algorithm fails (doing this
a small arbitrary number of times).

21 In fact in the NRMP more than 20,000 doctors participate, but 16,000 are from the US and are ranked higher in
the match.
Figure 4: The success rate in finding a stable outcome given the number of singles (x-axis), where
the number of couples is $n^{1-\varepsilon}$ for three different values of $\varepsilon$.

to be 16,000 and 800 respectively, as in the NRMP, and found the percentage of singles and couples
that get their k-th most preferred choice. We assumed that there is no fitness, i.e. preference
distributions of both doctors and hospitals are uniform.




Figure 5: The histogram shows the percentage of singles and couples that got their k-th favorite
choice for each k = 1, . . . , 8.



In Figure 6 we provide the same histogram but with fitness added to hospitals; each hospital has
been assigned a score uniformly at random from the interval [0.2, 1]. To decide the next preference
of a doctor, she draws uniformly at random a hospital h and a number from [0.2, 1], and if h's score is
below the number, the doctor resamples such a pair.
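The resampling rule described above is a form of rejection sampling and can be sketched in Python. This is an illustration only: the function names, the `scores` dictionary, and the choice to skip hospitals already on the list are assumptions of this sketch, not details taken from the simulations.

```python
import random


def sample_hospital(scores, rng):
    """Draw one hospital: pick a hospital and a threshold uniformly, redraw while the score is below it."""
    hospitals = list(scores)
    while True:
        h = rng.choice(hospitals)
        threshold = rng.uniform(0.2, 1.0)
        if scores[h] >= threshold:
            return h


def sample_preference_list(scores, length, rng):
    """Build a preference list entry by entry (assumption: repeated hospitals are skipped)."""
    length = min(length, len(scores))  # guard against asking for more hospitals than exist
    prefs = []
    while len(prefs) < length:
        h = sample_hospital(scores, rng)
        if h not in prefs:
            prefs.append(h)
    return prefs


if __name__ == "__main__":
    rng = random.Random(0)
    scores = {f"h{i}": rng.uniform(0.2, 1.0) for i in range(20)}
    print(sample_preference_list(scores, 8, rng))
```

Under this rule a hospital with a higher score is accepted more often per draw, so it tends to appear earlier in the doctors' lists.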
Figure 6: The histogram shows the percentage of singles and couples that got their k-th most
preferred choice for each k = 1, . . . , 8. Hospitals have a fitness score.



7 'Almost' Linear is Necessary 

In Section 4 we showed that the SoDA algorithm finds a stable matching with probability approaching
1 as n tends to infinity, assuming the number of couples grows at a rate of $n^{1-\varepsilon}$ (for any
$0 < \varepsilon < 1$). In Section 6 we saw that when the number of couples is a constant fraction of the total
capacity, there is a constant probability of failure. One might suggest that the SoDA algorithm
does not search through enough permutations and that if it fails there might still exist a stable matching.
We show that not only will SoDA fail with constant probability, but so will any other algorithm, i.e.
with constant probability a stable matching does not exist. For simplicity we will consider only
uniformly distributed preferences and a capacity of 1 for each hospital.

Theorem 12 Consider a random matching market with n couples and n singles, $\lambda n$ hospitals
for sufficiently large $\lambda$, each of capacity 1, and preferences distributed uniformly. Then with some
probability $\delta > 0$ not depending on n, no stable matching exists.
Proof: Consider the following event E: there exist a couple $c = (m_c, f_c) \in C$, a single doctor
$s \in S$ and two hospitals $h_1 \ne h_2$ so that the most preferred pair of hospitals by c is $(h_1, h_2)$ and
the following properties hold:

(i) $h_2 \succ_s h_1 \succ_s h$ for any $h \notin \{h_1, h_2\}$.



The result is also true for $\alpha n$ couples, for any constant $\alpha > 0$.
(ii) $s \succ_{h_1} m_c$.

(iii) $f_c \succ_{h_2} s$.

Observe that if only the couple c and the single doctor s existed, no stable matching would exist.
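The observation can be checked by brute force on the smallest instance of the event E. The sketch below is an illustration under simplifying assumptions that are not in the text: the market is restricted to the two hospitals h1 and h2 (capacity 1 each), the pair (h1, h2) is the couple's only acceptable pair, and every doctor is acceptable to every hospital. All names in the code are hypothetical.

```python
from itertools import product

SINGLE_PREFS = ["h2", "h1"]      # property (i): s ranks h2 first and h1 second
COUPLE_PREFS = [("h1", "h2")]    # m at h1, f at h2; assumed to be the only acceptable pair


def stable_matchings(hosp_rank):
    """Enumerate all stable matchings; hosp_rank[h] lists doctors from most to least preferred."""

    def accepts(mu, h, d, leaving=()):
        # h takes d if it is empty, if its occupant is leaving anyway, or if it prefers d.
        occ = next((x for x, hh in mu.items() if hh == h), None)
        if occ is None or occ in leaving:
            return True
        return hosp_rank[h].index(d) < hosp_rank[h].index(occ)

    def is_stable(mu):
        cur = mu["s"]
        s_better = SINGLE_PREFS if cur is None else SINGLE_PREFS[:SINGLE_PREFS.index(cur)]
        if any(accepts(mu, h, "s") for h in s_better):
            return False  # the single doctor blocks
        pair = (mu["m"], mu["f"])
        c_better = COUPLE_PREFS if pair == (None, None) else COUPLE_PREFS[:COUPLE_PREFS.index(pair)]
        return not any(
            accepts(mu, hm, "m", leaving=("m", "f")) and accepts(mu, hf, "f", leaving=("m", "f"))
            for hm, hf in c_better
        )

    result = []
    for s_h, pair in product([None] + SINGLE_PREFS, [None] + COUPLE_PREFS):
        m_h, f_h = pair if pair else (None, None)
        used = [h for h in (s_h, m_h, f_h) if h is not None]
        if len(used) != len(set(used)):
            continue  # violates capacity 1
        mu = {"s": s_h, "m": m_h, "f": f_h}
        if is_stable(mu):
            result.append(mu)
    return result


if __name__ == "__main__":
    # Properties (ii) and (iii): h1 ranks s above m, and h2 ranks f above s.
    print(stable_matchings({"h1": ["s", "m", "f"], "h2": ["f", "s", "m"]}))
```

With the rankings satisfying (ii) and (iii) the enumeration finds no stable matching, whereas reversing (ii), so that h1 ranks m above s, makes the matching with the couple at (h1, h2) and s unmatched stable.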

The proof will follow by first bounding (from below) the probability of the event E and then
bounding (from above) the probability that some doctor other than those in the event E ever obtains
either $h_1$ or $h_2$ in any stable matching.

Fix a couple $c \in C$ and a single s and let $(h_1, h_2)$ be the pair of hospitals most preferred by c.
The probability that h\ 7^ hi, and properties (ii) and (iii) hold is 5 > ^ • 52 . The probability that 
hi ^ h% and properties (i)-(iii) hold is SI (^pjjpj = ^ (^?)- Therefore, since there are n couples 
the probability that for a given single s there exists a couple c such that h\ 7^ /12 and properties 
(i)-(iii) hold is Q (-). Therefore since there are n singles, the probability that there exist a single 
s such that the event E holds is some constant 7 > 0. 
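
As a rough sanity check of the claim that the event $E$ has probability bounded away from zero, one can estimate it by simulation. The sketch below is not part of the proof and rests on assumptions: each couple's top pair is drawn uniformly (the two hospitals may coincide), each single's two most preferred hospitals are drawn uniformly without repetition, and hospital rankings are induced by independent random scores. The function names `has_event_E` and `estimate_gamma` are hypothetical.

```python
import random


def has_event_E(couple_tops, single_top2, rank):
    """Check whether some couple and single satisfy h1 != h2 and properties (i)-(iii).

    couple_tops[c] = (h1, h2): most preferred pair of couple c (m at h1, f at h2).
    single_top2[s] = (first, second): the two most preferred hospitals of single s.
    rank(h, d): position of doctor d in hospital h's list (smaller is better).
    """
    by_top2 = {}
    for s, top2 in enumerate(single_top2):
        by_top2.setdefault(top2, []).append(s)
    for c, (h1, h2) in enumerate(couple_tops):
        if h1 == h2:
            continue
        for s in by_top2.get((h2, h1), []):                        # property (i)
            if (rank(h1, ("s", s)) < rank(h1, ("m", c))            # property (ii)
                    and rank(h2, ("f", c)) < rank(h2, ("s", s))):  # property (iii)
                return True
    return False


def estimate_gamma(n, lam, trials, seed=0):
    """Monte Carlo estimate of Pr[E] with n couples, n singles, lam * n hospitals."""
    rng = random.Random(seed)
    num_hospitals = lam * n
    hits = 0
    for _ in range(trials):
        couple_tops = [(rng.randrange(num_hospitals), rng.randrange(num_hospitals))
                       for _ in range(n)]
        single_top2 = [tuple(rng.sample(range(num_hospitals), 2)) for _ in range(n)]
        scores = {}

        def rank(h, d):
            # Lazily drawn i.i.d. scores induce a uniformly random ranking.
            return scores.setdefault((h, d), rng.random())

        hits += has_event_E(couple_tops, single_top2, rank)
    return hits / trials


if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, estimate_gamma(n, lam=2, trials=2000))
```

Under these assumptions the estimate should stay roughly flat as `n` grows for a fixed `lam`, which is what a constant $\gamma$ would predict; the exact values depend on the sampling model chosen here.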

Suppose the event $E$ occurs with the couple $c'$ and the single doctor $s'$, and let $D' = D \setminus \{f_{c'}, m_{c'}, s'\}$. 
Consider the following application/rejection algorithm, in which doctors are assigned to $l > 1$ 
positions (rather than 1): 

$l$-Pessimistic DA: At each step $t = 1, 2, \ldots$, either a single doctor $s \in S$ or a couple $c \in C$ 
that has fewer than $l$ temporary assignments is chosen at random and applies to the most preferred 
hospital or pair of hospitals, respectively, on its list to which it has not applied so far. A hospital 
$h$ temporarily assigns an applying doctor $d$ if and only if no other doctor is currently assigned to $h$ 
and no other doctor applied to $h$ at this step. If some doctor $d$ applies to $h$ and some doctor $d'$ 
(possibly $d' = d$) is temporarily assigned to $h$, then $h$ rejects both $d$ and $d'$. [23] 
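
The process just described can be simulated directly. The following Python sketch is one possible reading of it under explicit assumptions: preferences are uniform and drawn lazily (each new application is a uniform draw among options not yet tried), a couple may draw a pair with the same hospital twice (which is then rejected), and a couple that loses one hospital of a pair loses the whole pair. The function name `pessimistic_da` and its parameters are hypothetical.

```python
import random


def pessimistic_da(n_couples, n_singles, n_hospitals, l=3, seed=0, max_steps=10**6):
    """Simulate l-Pessimistic DA with lazily drawn uniform preferences.

    Returns (held, visited, finished): held maps each agent to its current
    temporary assignments (tuples of hospitals), visited is the set of hospitals
    applied to so far, finished tells whether every agent holds l assignments.
    """
    rng = random.Random(seed)
    agents = [("couple", i) for i in range(n_couples)]
    agents += [("single", i) for i in range(n_singles)]
    applied = {a: set() for a in agents}   # applications already made by each agent
    held = {a: [] for a in agents}         # current temporary assignments
    occupant = {}                          # hospital -> (agent, assignment) holding it
    visited = set()

    for _ in range(max_steps):
        active = [a for a in agents if len(held[a]) < l]
        if not active:
            return held, visited, True
        agent = rng.choice(active)
        size = 2 if agent[0] == "couple" else 1
        if len(applied[agent]) >= n_hospitals ** size:
            raise ValueError("preference list exhausted")
        # Next entry of a uniformly random list = uniform draw among untried options.
        while True:
            target = tuple(rng.randrange(n_hospitals) for _ in range(size))
            if target not in applied[agent]:
                break
        applied[agent].add(target)
        visited.update(target)

        # Holders of any targeted hospital lose that assignment (whole pair for a couple).
        holders = {occupant[h] for h in target if h in occupant}
        for holder, assignment in holders:
            held[holder].remove(assignment)
            for h in assignment:
                del occupant[h]
        # The applicant is rejected if any target was held, or if both members
        # of a couple applied to the same hospital at this step.
        rejected = bool(holders) or len(set(target)) < size
        if not rejected:
            held[agent].append(target)
            for h in target:
                occupant[h] = (agent, target)
    return held, visited, False


if __name__ == "__main__":
    n, lam = 200, 50
    held, visited, finished = pessimistic_da(n, n, lam * n, l=3, seed=1)
    print("finished:", finished, "fraction visited:", len(visited) / (lam * n))
```

Running it for growing `n` with a fixed large ratio of hospitals to doctors gives a feel for how small the visited fraction stays; the simulation is only an illustration and says nothing about the constants in the analysis.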

We will first show that the probability that any doctor other than $f_{c'}$, $m_{c'}$ and $s'$ ever applies to $h_1$ 
or $h_2$ in the 3-Pessimistic DA process is bounded from above by a small constant. We will show a 
stronger lemma: 

Lemma 13 With constant probability, no more than $\alpha n$ hospitals are visited in the 3-Pessimistic DA 
process, for some $\alpha < \lambda$. 

Proof: Let $L = \{0, 1, 2, 3\}$. For every $q \in L$ we say that a doctor is $q$-settled if it is temporarily 
assigned to exactly $q$ positions, and we say that a hospital $h$ is visited if some doctor applied to it 
during the 3-Pessimistic DA process. 

23 As usual, if a member of a couple is rejected from some hospital, its other member is also rejected. 

For every $t = 0, 1, 2, \ldots$ and every $q \in L$, denote by $A^q_t$ the number of $q$-settled doctors at step 
$t$ and by $V_t$ the number of visited hospitals up to step $t$, where $A^0_0 = 3n$ and $A^1_0 = A^2_0 = A^3_0 = V_0 = 0$. 
Let $Y_t = V_t + 15A^0_t + 10A^1_t + 5A^2_t$ and consider the process $X_t = Y_t + t$ for every $t = 0, \ldots, \min(J, K)$, 
where $K$ is the first step at which $V_K$ reaches a fixed constant fraction of $\lambda n$, and $J$ is the first step 
in which $A^0_J = A^1_J = A^2_J = 0$ (i.e. $A^3_J = 3n$). 

Claim: $X_1, X_2, \ldots$ is a super-martingale, that is, for every $t > 0$, $E[X_{t+1} \mid X_1, \ldots, X_t] \le X_t$. 
Proof: Suppose a couple $c$ is chosen at step $t$ and has $q \in L \setminus \{3\}$ temporary assignments. If 
it applies to two unvisited hospitals then $A^{q+1}_{t+1} = A^{q+1}_t + 2$, $A^q_{t+1} = A^q_t - 2$, and $A^{q'}_{t+1} = A^{q'}_t$ 
for $q' \in L \setminus \{q, q+1\}$. Thus the contribution of the couple to $Y_t$ drops by 10. If $c$ applies to 
one unvisited hospital and one visited hospital then for every $q \in L$, $A^q_{t+1} \le A^q_t + 2$, since at most 
one other couple loses a temporary assignment. If it applies to two visited hospitals then for every 
$q \in L$, $A^q_{t+1} \le A^q_t + 4$, since at most 2 other couples lose a temporary assignment. For singles 
similar bounds can be used. For each $q = 0, 1, 2$ let $Q^q_t$ be the event that at the beginning of 
step $t$ a couple with $q$ temporary assignments is chosen, and $W^q_t$ the event that a single with $q$ 
temporary assignments is chosen. Therefore for every $q \in L \setminus \{3\}$ 

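$$E[X_{t+1} \mid X_1, \ldots, X_t, Q^q_{t+1}] = E[X_{t+1} \mid X_t, Q^q_{t+1}] \le \left(\frac{\lambda n - V_t}{\lambda n}\right)^2 \left(V_t + 2 + 15A^0_t + 10A^1_t + 5A^2_t - 10\right)$$

$$+ \; 2 \cdot \frac{(\lambda n - V_t) V_t}{(\lambda n)^2} \left(V_t + 1 + 15A^0_t + 10A^1_t + 5A^2_t + 10\right) + \frac{V_t^2}{(\lambda n)^2} \left(V_t + 15A^0_t + 10A^1_t + 5A^2_t + 20\right) + t + 1$$

$$\le V_t + 15A^0_t + 10A^1_t + 5A^2_t + t,$$

where the last inequality holds as long as $V_t$ is below a sufficiently small constant fraction of $\lambda n$. Similarly, 

$$E[X_{t+1} \mid X_1, \ldots, X_t, W^q_{t+1}] = E[X_{t+1} \mid X_t, W^q_{t+1}] \le \frac{\lambda n - V_t}{\lambda n} \left(V_t + 1 + 15A^0_t + 10A^1_t + 5A^2_t - 5\right)$$

$$+ \; \frac{V_t}{\lambda n} \left(V_t + 15A^0_t + 10A^1_t + 5A^2_t + 10\right) + t + 1 \le V_t + 15A^0_t + 10A^1_t + 5A^2_t + t.$$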

Therefore, since either a couple or a single is chosen at each step, we obtain that 
$E[X_{t+1} \mid X_t] \le V_t + 15A^0_t + 10A^1_t + 5A^2_t + t = X_t$. □ 
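
The two conditional bounds in the claim reduce to a statement about a single number, the fraction of hospitals visited so far. The sketch below, which is only a numerical illustration, writes $p = V_t / (\lambda n)$ (a notation introduced here, not in the text) and evaluates the resulting upper bounds on the expected one-step change of $X_t$, using the coefficients that appear in the bounds above. The function names are hypothetical.

```python
def couple_drift(p):
    """Upper bound on E[X_{t+1} - X_t] when a couple is chosen and a fraction p
    of the hospitals has been visited."""
    unvisited = 1.0 - p
    return (unvisited ** 2 * (2 - 10)        # two unvisited hospitals: V +2, Y -10
            + 2 * unvisited * p * (1 + 10)   # one unvisited, one visited: V +1, Y +10
            + p ** 2 * 20                    # two visited hospitals: Y +20
            + 1)                             # the step counter t advances by 1


def single_drift(p):
    """Same bound when a single is chosen."""
    return ((1.0 - p) * (1 - 5)              # unvisited hospital: V +1, Y -5
            + p * 10                         # visited hospital: Y +10
            + 1)                             # the step counter t advances by 1


def largest_safe_fraction(step=1e-4):
    """Largest p on a grid for which both drift bounds are non-positive."""
    p = 0.0
    while couple_drift(p + step) <= 0 and single_drift(p + step) <= 0:
        p += step
    return p


if __name__ == "__main__":
    for p in (0.0, 0.05, 0.1, 0.15, 0.19, 0.2):
        print(p, couple_drift(p), single_drift(p))
    print("both bounds are non-positive up to p of about", largest_safe_fraction())
```

With these coefficients both bounds are negative when few hospitals have been visited and the couple bound is the first to turn positive, a little below one fifth of the hospitals. This is consistent with the super-martingale property being claimed only while the number of visited hospitals is a small fraction of $\lambda n$.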

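As argued in the claim, $|X_{t+1} - X_t| < 22$ for every $t \ge 1$. Note that $X_0 = 45n$ and $X_T \ge V_T + T$, so if 
$V_T - V_0$ exceeds a given threshold then $X_T - X_0$ exceeds that threshold minus $45n$ plus $T$. Taking the 
threshold to be the constant fraction of $\lambda n$ at which the process is stopped, the Azuma-Hoeffding 
inequality bounds the probability of the latter event, for any $T \ge 1$, by a quantity that is at most 
$1 - \beta$ for some constant $\beta > 0$ and a sufficiently large $\lambda$; i.e. with constant probability the process 
never reaches that number of visited hospitals. □ 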

Lemma 13 shows that in the 3-Pessimistic DA process described above, with constant probability 
only a fraction of the hospitals are visited. This also implies that, with constant probability, 
the doctors in the process (all but $c'$ and $s'$) never visit $h_1$ or $h_2$. 



Lemma 13 also implies that with constant probability the 3-Pessimistic DA terminates and 
each player $i$, single or couple, obtains 3 different temporary assignments $p^1_i$, $p^2_i$ and $p^3_i$ (thus if $i$ is a 
couple, $p^j_i$ is a pair of hospitals), where $p^1_i \succ_i p^2_i \succ_i p^3_i$. 

To finish the proof we argue that in every stable matching, no agent $i$ is assigned to a hospital or pair 
of hospitals less preferred than $p^3_i$. Call a player $i$ (a single or a couple) that gets an assignment less 
preferred than $p^3_i$ poor, and let $K$ be the set of poor players. Suppose that $|K| = k > 0$. For a player $i$ 
to be poor, at least one hospital in each of $p^1_i$, $p^2_i$ and $p^3_i$ must be taken (if $i$ is a single then all $p^j_i$ are 
single hospitals and all must be taken). Since for any two players $j$ and $l$, 
$\{p^1_j, p^2_j, p^3_j\} \cap \{p^1_l, p^2_l, p^3_l\} = \emptyset$, 
there are at least $3k$ hospitals which need to be assigned. These hospitals cannot be assigned to 
players that are not poor (since they get better choices for themselves). Since there are only $k$ 
poor players, with a total of up to $2k$ doctors, they cannot be assigned to all $3k$ hospitals, a 
contradiction. □ 

Remark: If one assumes that doctors have constant length preference lists, the proof is significantly 
simpler; indeed one can show directly that the probability that $h_1$ and $h_2$ are not acceptable for 
any doctor is constant. 



8 Conclusion 

We showed using the SoDA algorithm that if the number of couples grows at a rate of $|C_n| = n^{1-\epsilon}$, 
then there exists a stable matching with probability approaching 1. One can argue that "in real 
life" the number of couples is indeed a linear fraction of the number of doctors, and the rate 
$|C_n| = n^{1-\epsilon}$ does not make sense. However, our correctness proof is only a lower bound on the 
performance of the algorithm, and it may perform much better in practice. Moreover, note that 
if $\epsilon$ were equal to $O(1/\log n)$, then the number of couples would be a linear fraction of the number of 
singles. In fact, our proof shows that the random market has a stable matching with probability 
at least $1 - (\log n)^{O(1/\epsilon)} / n^{\Omega(\epsilon)}$, which converges to 1 even if $\epsilon = \Omega(\log\log n / \sqrt{\log n})$, and not just 
when $\epsilon$ is constant. 

This means that we proved that the algorithm finds a stable outcome with probability approaching 1 
even when the number of couples grows like $n / 2^{\sqrt{\log n}\,\log\log n}$. Such growth is close 
to linear. Empirically it is indeed hard to distinguish between such subpolynomial factors and 
constant factors when there are n = 16,000 doctors. 



A few open problems that follow from this work are the following. In Theorem 12 and its proof 
we used a large excess number of hospitals to obtain the negative result. To some extent we do 
not expect that fewer hospitals will improve the chances of obtaining a stable matching. Figure 3 
suggests that when there are $\alpha n$ couples, the probability that the SoDA algorithm finds a stable 
matching decreases with $\alpha$. We conjecture that this is true in general, i.e. the probability that 
there exists a stable matching (not necessarily found by SoDA) is decreasing in $\alpha$. 